496 research outputs found

    Minimizing Communication in Linear Algebra

    Full text link
    In 1981 Hong and Kung proved a lower bound on the amount of communication needed to perform dense, matrix-multiplication using the conventional O(n3)O(n^3) algorithm, where the input matrices were too large to fit in the small, fast memory. In 2004 Irony, Toledo and Tiskin gave a new proof of this result and extended it to the parallel case. In both cases the lower bound may be expressed as Ω\Omega(#arithmetic operations / M\sqrt{M}), where M is the size of the fast memory (or local memory in the parallel case). Here we generalize these results to a much wider variety of algorithms, including LU factorization, Cholesky factorization, LDLTLDL^T factorization, QR factorization, algorithms for eigenvalues and singular values, i.e., essentially all direct methods of linear algebra. The proof works for dense or sparse matrices, and for sequential or parallel algorithms. In addition to lower bounds on the amount of data moved (bandwidth) we get lower bounds on the number of messages required to move it (latency). We illustrate how to extend our lower bound technique to compositions of linear algebra operations (like computing powers of a matrix), to decide whether it is enough to call a sequence of simpler optimal algorithms (like matrix multiplication) to minimize communication, or if we can do better. We give examples of both. We also show how to extend our lower bounds to certain graph theoretic problems. We point out recently designed algorithms for dense LU, Cholesky, QR, eigenvalue and the SVD problems that attain these lower bounds; implementations of LU and QR show large speedups over conventional linear algebra algorithms in standard libraries like LAPACK and ScaLAPACK. Many open problems remain.Comment: 27 pages, 2 table

    Guidance, Flight Mechanics and Trajectory Optimization. Volume 15 - Application of Optimization Techniques

    Get PDF
    Pontryagin maximum principle, calculus of variations, and dynamic programming optimization techniques applied to trajectory and guidance problem

    Guidance, flight mechanics and trajectory optimization. Volume 6 - The N-body problem and special perturbation techniques

    Get PDF
    Analytical formulations and numerical integration methods for many body problem and special perturbative technique

    Element orbitals for Kohn-Sham density functional theory

    Full text link
    We present a method to discretize the Kohn-Sham Hamiltonian matrix in the pseudopotential framework by a small set of basis functions automatically contracted from a uniform basis set such as planewaves. Each basis function is localized around an element, which is a small part of the global domain containing multiple atoms. We demonstrate that the resulting basis set achieves meV accuracy for 3D densely packed systems with a small number of basis functions per atom. The procedure is applicable to insulating and metallic systems

    Chebyshev polynomial filtered subspace iteration in the Discontinuous Galerkin method for large-scale electronic structure calculations

    Full text link
    The Discontinuous Galerkin (DG) electronic structure method employs an adaptive local basis (ALB) set to solve the Kohn-Sham equations of density functional theory (DFT) in a discontinuous Galerkin framework. The adaptive local basis is generated on-the-fly to capture the local material physics, and can systematically attain chemical accuracy with only a few tens of degrees of freedom per atom. A central issue for large-scale calculations, however, is the computation of the electron density (and subsequently, ground state properties) from the discretized Hamiltonian in an efficient and scalable manner. We show in this work how Chebyshev polynomial filtered subspace iteration (CheFSI) can be used to address this issue and push the envelope in large-scale materials simulations in a discontinuous Galerkin framework. We describe how the subspace filtering steps can be performed in an efficient and scalable manner using a two-dimensional parallelization scheme, thanks to the orthogonality of the DG basis set and block-sparse structure of the DG Hamiltonian matrix. The on-the-fly nature of the ALBs requires additional care in carrying out the subspace iterations. We demonstrate the parallel scalability of the DG-CheFSI approach in calculations of large-scale two-dimensional graphene sheets and bulk three-dimensional lithium-ion electrolyte systems. Employing 55,296 computational cores, the time per self-consistent field iteration for a sample of the bulk 3D electrolyte containing 8,586 atoms is 90 seconds, and the time for a graphene sheet containing 11,520 atoms is 75 seconds.Comment: Submitted to The Journal of Chemical Physic

    Phase I Clinical Trials in Acute Myeloid Leukemia: 23-Year Experience From Cancer Therapy Evaluation Program of the National Cancer Institute

    Get PDF
    Therapy for acute myeloid leukemia (AML) has largely remained unchanged, and outcomes are unsatisfactory. We sought to analyze outcomes of AML patients enrolled in phase I studies to determine whether overall response rates (ORR) and mortality rates have changed over time

    Lessons Learned from Implementing Service-Oriented Clinical Decision Support at Four Sites: A Qualitative Study

    Get PDF
    Objective To identify challenges, lessons learned and best practices for service-oriented clinical decision support, based on the results of the Clinical Decision Support Consortium, a multi-site study which developed, implemented and evaluated clinical decision support services in a diverse range of electronic health records. Methods Ethnographic investigation using the rapid assessment process, a procedure for agile qualitative data collection and analysis, including clinical observation, system demonstrations and analysis and 91 interviews. Results We identified challenges and lessons learned in eight dimensions: (1) hardware and software computing infrastructure, (2) clinical content, (3) human-computer interface, (4) people, (5) workflow and communication, (6) internal organizational policies, procedures, environment and culture, (7) external rules, regulations, and pressures and (8) system measurement and monitoring. Key challenges included performance issues (particularly related to data retrieval), differences in terminologies used across sites, workflow variability and the need for a legal framework. Discussion Based on the challenges and lessons learned, we identified eight best practices for developers and implementers of service-oriented clinical decision support: (1) optimize performance, or make asynchronous calls, (2) be liberal in what you accept (particularly for terminology), (3) foster clinical transparency, (4) develop a legal framework, (5) support a flexible front-end, (6) dedicate human resources, (7) support peer-to-peer communication, (8) improve standards. Conclusion The Clinical Decision Support Consortium successfully developed a clinical decision support service and implemented it in four different electronic health records and four diverse clinical sites; however, the process was arduous. The lessons identified by the Consortium may be useful for other developers and implementers of clinical decision support services

    R-matrix Floquet theory for laser-assisted electron-atom scattering

    Get PDF
    A new version of the R-matrix Floquet theory for laser-assisted electron-atom scattering is presented. The theory is non-perturbative and applicable to a non-relativistic many-electron atom or ion in a homogeneous linearly polarized field. It is based on the use of channel functions built from field-dressed target states, which greatly simplifies the general formalism.Comment: 18 pages, LaTeX2e, submitted to J.Phys.
    • …
    corecore